Overview

Dataset statistics

Number of variables: 34
Number of observations: 7032
Missing cells: 5163
Missing cells (%): 2.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.8 MiB
Average record size in memory: 272.0 B
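The percentages and sizes in this table follow from the raw counts. The sketch below recomputes them from the figures printed in this report; the variable names are illustrative only, and the 5163 missing values attributed to Churn Reason are taken from the Alerts section.

```python
# Figures copied from the "Dataset statistics" table of this report.
n_variables = 34
n_observations = 7032
missing_cells = 5163
avg_record_bytes = 272.0

# A "cell" is one value of one variable for one observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory is roughly the average record size times the row count.
total_mib = avg_record_bytes * n_observations / 2**20

# The Alerts section attributes 5163 missing values to Churn Reason alone.
churn_reason_missing_pct = 100 * missing_cells / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size: {total_mib:.1f} MiB")
print(f"Churn Reason missing: {churn_reason_missing_pct:.1f}%")
```

Rounded to one decimal place, these values agree with the 2.2%, 1.8 MiB and 73.4% shown in the report.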

Variable types

Categorical: 14
Numeric: 8
Boolean: 12

Alerts

Count has constant value "1" [Constant]
Country has constant value "United States" [Constant]
State has constant value "California" [Constant]
CustomerID has a high cardinality: 7032 distinct values [High cardinality]
City has a high cardinality: 1129 distinct values [High cardinality]
Lat Long has a high cardinality: 1652 distinct values [High cardinality]
Zip Code is highly correlated with Latitude and 1 other field [High correlation]
Latitude is highly correlated with Zip Code and 1 other field [High correlation]
Longitude is highly correlated with Zip Code and 1 other field [High correlation]
Tenure Months is highly correlated with Total Charges [High correlation]
Monthly Charges is highly correlated with Total Charges [High correlation]
Total Charges is highly correlated with Tenure Months and 1 other field [High correlation]
Churn Value is highly correlated with Churn Score and 1 other field [High correlation]
Churn Score is highly correlated with Churn Value and 1 other field [High correlation]
Churn is highly correlated with Churn Value and 1 other field [High correlation]
Zip Code is highly correlated with Latitude [High correlation]
Latitude is highly correlated with Zip Code and 1 other field [High correlation]
Longitude is highly correlated with Latitude [High correlation]
Tenure Months is highly correlated with Total Charges [High correlation]
Total Charges is highly correlated with Tenure Months [High correlation]
Churn Value is highly correlated with Churn Score and 1 other field [High correlation]
Churn Score is highly correlated with Churn Value and 1 other field [High correlation]
Churn is highly correlated with Churn Value and 1 other field [High correlation]
Device Protection is highly correlated with Count and 2 other fields [High correlation]
Count is highly correlated with Device Protection and 21 other fields [High correlation]
Internet Service is highly correlated with Count and 2 other fields [High correlation]
Payment Method is highly correlated with Count and 2 other fields [High correlation]
Streaming TV is highly correlated with Count and 3 other fields [High correlation]
Online Backup is highly correlated with Count and 2 other fields [High correlation]
State is highly correlated with Device Protection and 21 other fields [High correlation]
Partner is highly correlated with Count and 2 other fields [High correlation]
Online Security is highly correlated with Count and 2 other fields [High correlation]
Gender is highly correlated with Count and 2 other fields [High correlation]
Churn Label is highly correlated with Count and 5 other fields [High correlation]
Churn is highly correlated with Count and 5 other fields [High correlation]
Paperless Billing is highly correlated with Count and 2 other fields [High correlation]
Tech Support is highly correlated with Count and 2 other fields [High correlation]
Churn Reason is highly correlated with Count and 5 other fields [High correlation]
Dependents is highly correlated with Count and 2 other fields [High correlation]
Contract is highly correlated with Count and 2 other fields [High correlation]
Senior Citizen is highly correlated with Count and 2 other fields [High correlation]
Multiple Lines is highly correlated with Count and 3 other fields [High correlation]
Country is highly correlated with Device Protection and 21 other fields [High correlation]
Streaming Movies is highly correlated with Count and 3 other fields [High correlation]
Churn Value is highly correlated with Count and 5 other fields [High correlation]
Phone Service is highly correlated with Count and 3 other fields [High correlation]
Zip Code is highly correlated with Latitude and 1 other field [High correlation]
Latitude is highly correlated with Zip Code and 1 other field [High correlation]
Longitude is highly correlated with Zip Code and 1 other field [High correlation]
Partner is highly correlated with Dependents [High correlation]
Dependents is highly correlated with Partner [High correlation]
Tenure Months is highly correlated with Contract and 1 other field [High correlation]
Phone Service is highly correlated with Multiple Lines and 1 other field [High correlation]
Multiple Lines is highly correlated with Phone Service and 2 other fields [High correlation]
Internet Service is highly correlated with Multiple Lines and 3 other fields [High correlation]
Online Security is highly correlated with Tech Support and 2 other fields [High correlation]
Online Backup is highly correlated with Monthly Charges and 1 other field [High correlation]
Device Protection is highly correlated with Streaming TV and 3 other fields [High correlation]
Tech Support is highly correlated with Online Security and 2 other fields [High correlation]
Streaming TV is highly correlated with Device Protection and 3 other fields [High correlation]
Streaming Movies is highly correlated with Device Protection and 3 other fields [High correlation]
Contract is highly correlated with Tenure Months and 2 other fields [High correlation]
Monthly Charges is highly correlated with Phone Service and 9 other fields [High correlation]
Total Charges is highly correlated with Tenure Months and 9 other fields [High correlation]
Churn Label is highly correlated with Churn Value and 2 other fields [High correlation]
Churn Value is highly correlated with Churn Label and 2 other fields [High correlation]
Churn Score is highly correlated with Churn Label and 2 other fields [High correlation]
Churn is highly correlated with Churn Label and 2 other fields [High correlation]
Churn Reason has 5163 (73.4%) missing values [Missing]
CustomerID is uniformly distributed [Uniform]
Lat Long is uniformly distributed [Uniform]
CustomerID has unique values [Unique]
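
The Constant and Unique alerts can be approximated by counting distinct values per column. The following is a simplified sketch, not the profiler's own implementation: the five rows are a hypothetical excerpt assembled from the Sample values shown later in this report, and `classify_columns` is an illustrative helper name.

```python
# Hypothetical five-row excerpt built from the "Sample" values in this report.
rows = [
    {"CustomerID": "3668-QPYBK", "Count": "1", "State": "California", "Gender": "Male"},
    {"CustomerID": "9237-HQITU", "Count": "1", "State": "California", "Gender": "Female"},
    {"CustomerID": "9305-CDSKC", "Count": "1", "State": "California", "Gender": "Female"},
    {"CustomerID": "7892-POOKP", "Count": "1", "State": "California", "Gender": "Female"},
    {"CustomerID": "0280-XJGEX", "Count": "1", "State": "California", "Gender": "Male"},
]


def classify_columns(rows):
    """Label each column 'constant', 'unique', or 'ok' by its distinct count."""
    labels = {}
    for col in rows[0]:
        distinct = {row[col] for row in rows}
        if len(distinct) == 1:
            labels[col] = "constant"   # one value everywhere, carries no signal
        elif len(distinct) == len(rows):
            labels[col] = "unique"     # one value per row, behaves like an ID
        else:
            labels[col] = "ok"
    return labels


labels = classify_columns(rows)
keep = [col for col, label in labels.items() if label == "ok"]
print(labels)
print("columns worth modelling:", keep)
```

On this excerpt only Gender survives, which mirrors why Count and State are marked as rejected and CustomerID is flagged as unique in the full report.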

Reproduction

Analysis started: 2023-03-07 00:32:54.774934
Analysis finished: 2023-03-07 00:33:00.542773
Duration: 5.77 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

CustomerID
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 7032
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 55.1 KiB

Value | Count
3668-QPYBK | 1
9169-BSVIN | 1
0206-OYVOC | 1
6418-HNFED | 1
8805-JNRAZ | 1
Other values (7027) | 7027

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 70320
Distinct characters: 37
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
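
As a rough illustration of how such character properties can be tallied, the sketch below uses Python's standard-library `unicodedata` module as a stand-in (the profiler's own implementation may differ) on the five CustomerID sample rows listed in this report.

```python
import unicodedata
from collections import Counter

# The five CustomerID values shown in the "Sample" block of this report.
sample_ids = ["3668-QPYBK", "9237-HQITU", "9305-CDSKC", "7892-POOKP", "0280-XJGEX"]

# unicodedata.category returns the two-letter general category of a character:
# "Lu" = uppercase letter, "Nd" = decimal number, "Pd" = dash punctuation.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_ids for ch in value
)

total = sum(category_counts.values())
for category, count in category_counts.most_common():
    print(f"{category}: {count} ({100 * count / total:.1f}%)")
```

Each ID consists of four digits, a dash and five uppercase letters, so the sample reproduces the 50% / 40% / 10% split reported for the full column.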

Unique

Unique: 7032
Unique (%): 100.0%

Sample

1st row: 3668-QPYBK
2nd row: 9237-HQITU
3rd row: 9305-CDSKC
4th row: 7892-POOKP
5th row: 0280-XJGEX

Common Values

Value | Count | Frequency (%)
3668-QPYBK | 1 | < 0.1%
9169-BSVIN | 1 | < 0.1%
0206-OYVOC | 1 | < 0.1%
6418-HNFED | 1 | < 0.1%
8805-JNRAZ | 1 | < 0.1%
8439-LTUGF | 1 | < 0.1%
1767-TGTKO | 1 | < 0.1%
1194-HVAIF | 1 | < 0.1%
1080-BWSYE | 1 | < 0.1%
8387-UGUSU | 1 | < 0.1%
Other values (7022) | 7022 | 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3668-qpybk1
 
< 0.1%
8168-uqwwf1
 
< 0.1%
7892-pookp1
 
< 0.1%
0280-xjgex1
 
< 0.1%
4190-mfluw1
 
< 0.1%
8779-qrdmv1
 
< 0.1%
1066-jksgk1
 
< 0.1%
6467-chfzw1
 
< 0.1%
8665-utdhz1
 
< 0.1%
8773-hhuoz1
 
< 0.1%
Other values (7022)7022
99.9%

Most occurring characters

ValueCountFrequency (%)
-7032
 
10.0%
22894
 
4.1%
92879
 
4.1%
62868
 
4.1%
72828
 
4.0%
02828
 
4.0%
82812
 
4.0%
52805
 
4.0%
32785
 
4.0%
12721
 
3.9%
Other values (27)37868
53.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter35160
50.0%
Decimal Number28128
40.0%
Dash Punctuation7032
 
10.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O1438
 
4.1%
H1396
 
4.0%
B1393
 
4.0%
S1384
 
3.9%
V1377
 
3.9%
T1372
 
3.9%
C1367
 
3.9%
Z1364
 
3.9%
K1362
 
3.9%
F1362
 
3.9%
Other values (16)21345
60.7%
Decimal Number
ValueCountFrequency (%)
22894
10.3%
92879
10.2%
62868
10.2%
72828
10.1%
02828
10.1%
82812
10.0%
52805
10.0%
32785
9.9%
12721
9.7%
42708
9.6%
Dash Punctuation
ValueCountFrequency (%)
-7032
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common35160
50.0%
Latin35160
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
O1438
 
4.1%
H1396
 
4.0%
B1393
 
4.0%
S1384
 
3.9%
V1377
 
3.9%
T1372
 
3.9%
C1367
 
3.9%
Z1364
 
3.9%
K1362
 
3.9%
F1362
 
3.9%
Other values (16)21345
60.7%
Common
ValueCountFrequency (%)
-7032
20.0%
22894
8.2%
92879
8.2%
62868
8.2%
72828
8.0%
02828
8.0%
82812
 
8.0%
52805
 
8.0%
32785
 
7.9%
12721
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII70320
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
-7032
 
10.0%
22894
 
4.1%
92879
 
4.1%
62868
 
4.1%
72828
 
4.0%
02828
 
4.0%
82812
 
4.0%
52805
 
4.0%
32785
 
4.0%
12721
 
3.9%
Other values (27)37868
53.9%

Count
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.1 KiB

Value | Count
1 | 7032

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 7032
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 7032 | 100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
17032
100.0%

Most occurring characters

ValueCountFrequency (%)
17032
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number7032
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
17032
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common7032
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
17032
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII7032
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
17032
100.0%

Country
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.1 KiB

Value | Count
United States | 7032

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 91416
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: United States
2nd row: United States
3rd row: United States
4th row: United States
5th row: United States

Common Values

Value | Count | Frequency (%)
United States | 7032 | 100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
united7032
50.0%
states7032
50.0%

Most occurring characters

ValueCountFrequency (%)
t21096
23.1%
e14064
15.4%
U7032
 
7.7%
n7032
 
7.7%
i7032
 
7.7%
d7032
 
7.7%
7032
 
7.7%
S7032
 
7.7%
a7032
 
7.7%
s7032
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter70320
76.9%
Uppercase Letter14064
 
15.4%
Space Separator7032
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t21096
30.0%
e14064
20.0%
n7032
 
10.0%
i7032
 
10.0%
d7032
 
10.0%
a7032
 
10.0%
s7032
 
10.0%
Uppercase Letter
ValueCountFrequency (%)
U7032
50.0%
S7032
50.0%
Space Separator
ValueCountFrequency (%)
7032
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin84384
92.3%
Common7032
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t21096
25.0%
e14064
16.7%
U7032
 
8.3%
n7032
 
8.3%
i7032
 
8.3%
d7032
 
8.3%
S7032
 
8.3%
a7032
 
8.3%
s7032
 
8.3%
Common
ValueCountFrequency (%)
7032
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII91416
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t21096
23.1%
e14064
15.4%
U7032
 
7.7%
n7032
 
7.7%
i7032
 
7.7%
d7032
 
7.7%
7032
 
7.7%
S7032
 
7.7%
a7032
 
7.7%
s7032
 
7.7%

State
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.1 KiB

Value | Count
California | 7032

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 70320
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: California
2nd row: California
3rd row: California
4th row: California
5th row: California

Common Values

Value | Count | Frequency (%)
California | 7032 | 100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
california7032
100.0%

Most occurring characters

ValueCountFrequency (%)
a14064
20.0%
i14064
20.0%
C7032
10.0%
l7032
10.0%
f7032
10.0%
o7032
10.0%
r7032
10.0%
n7032
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter63288
90.0%
Uppercase Letter7032
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a14064
22.2%
i14064
22.2%
l7032
11.1%
f7032
11.1%
o7032
11.1%
r7032
11.1%
n7032
11.1%
Uppercase Letter
ValueCountFrequency (%)
C7032
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin70320
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a14064
20.0%
i14064
20.0%
C7032
10.0%
l7032
10.0%
f7032
10.0%
o7032
10.0%
r7032
10.0%
n7032
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII70320
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a14064
20.0%
i14064
20.0%
C7032
10.0%
l7032
10.0%
f7032
10.0%
o7032
10.0%
r7032
10.0%
n7032
10.0%

City
Categorical

HIGH CARDINALITY

Distinct: 1129
Distinct (%): 16.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.1 KiB

Value | Count
Los Angeles | 304
San Diego | 150
San Jose | 112
Sacramento | 108
San Francisco | 104
Other values (1124) | 6254

Length

Max length: 22
Median length: 19
Mean length: 9.22298066
Min length: 3

Characters and Unicode

Total characters: 64856
Distinct characters: 52
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Los Angeles
2nd row: Los Angeles
3rd row: Los Angeles
4th row: Los Angeles
5th row: Los Angeles

Common Values

Value | Count | Frequency (%)
Los Angeles | 304 | 4.3%
San Diego | 150 | 2.1%
San Jose | 112 | 1.6%
Sacramento | 108 | 1.5%
San Francisco | 104 | 1.5%
Fresno | 64 | 0.9%
Long Beach | 60 | 0.9%
Oakland | 52 | 0.7%
Stockton | 44 | 0.6%
Bakersfield | 40 | 0.6%
Other values (1119) | 5994 | 85.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
san582
 
5.6%
los349
 
3.4%
angeles304
 
3.0%
valley183
 
1.8%
santa182
 
1.8%
beach172
 
1.7%
city164
 
1.6%
diego150
 
1.5%
sacramento116
 
1.1%
jose112
 
1.1%
Other values (1131)7988
77.5%

Most occurring characters

ValueCountFrequency (%)
a6946
 
10.7%
e6137
 
9.5%
n5102
 
7.9%
o4936
 
7.6%
l4003
 
6.2%
r3644
 
5.6%
i3366
 
5.2%
3270
 
5.0%
s2921
 
4.5%
t2703
 
4.2%
Other values (42)21828
33.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter51284
79.1%
Uppercase Letter10302
 
15.9%
Space Separator3270
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a6946
13.5%
e6137
12.0%
n5102
9.9%
o4936
9.6%
l4003
7.8%
r3644
 
7.1%
i3366
 
6.6%
s2921
 
5.7%
t2703
 
5.3%
d1644
 
3.2%
Other values (16)9882
19.3%
Uppercase Letter
ValueCountFrequency (%)
S1462
14.2%
C1008
 
9.8%
L898
 
8.7%
B749
 
7.3%
A672
 
6.5%
M618
 
6.0%
P613
 
6.0%
R455
 
4.4%
F442
 
4.3%
V419
 
4.1%
Other values (15)2966
28.8%
Space Separator
ValueCountFrequency (%)
3270
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin61586
95.0%
Common3270
 
5.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a6946
 
11.3%
e6137
 
10.0%
n5102
 
8.3%
o4936
 
8.0%
l4003
 
6.5%
r3644
 
5.9%
i3366
 
5.5%
s2921
 
4.7%
t2703
 
4.4%
d1644
 
2.7%
Other values (41)20184
32.8%
Common
ValueCountFrequency (%)
3270
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII64856
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a6946
 
10.7%
e6137
 
9.5%
n5102
 
7.9%
o4936
 
7.6%
l4003
 
6.2%
r3644
 
5.6%
i3366
 
5.2%
3270
 
5.0%
s2921
 
4.5%
t2703
 
4.2%
Other values (42)21828
33.7%

Zip Code
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1652
Distinct (%): 23.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 93523.03257
Minimum: 90001
Maximum: 96161
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 55.1 KiB

Quantile statistics

Minimum: 90001
5-th percentile: 90232
Q1: 92102
median: 93552.5
Q3: 95354
95-th percentile: 96031.45
Maximum: 96161
Range: 6160
Interquartile range (IQR): 3252

Descriptive statistics

Standard deviation: 1865.515958
Coefficient of variation (CV): 0.01994712861
Kurtosis: -1.153822744
Mean: 93523.03257
Median Absolute Deviation (MAD): 1642
Skewness: -0.2516583316
Sum: 657653965
Variance: 3480149.791
Monotonicity: Not monotonic
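
Several of the statistics above are derived from others in the same table. As a quick consistency check, the sketch below recomputes the range, IQR, coefficient of variation and variance from the Zip Code figures printed in this report, using the standard textbook definitions; the variable names are illustrative.

```python
# Figures copied from the Zip Code statistics in this report.
minimum, maximum = 90001, 96161
q1, q3 = 92102, 95354
mean = 93523.03257
std = 1865.515958

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # standard deviation relative to the mean
variance = std ** 2               # squared standard deviation

print(f"range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.10f}")
print(f"variance: {variance:.3f}")
```

The recomputed values agree with the reported Range, IQR, CV and Variance up to the rounding of the printed inputs.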
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
90003 | 5 | 0.1%
92284 | 5 | 0.1%
91941 | 5 | 0.1%
91916 | 5 | 0.1%
91913 | 5 | 0.1%
91911 | 5 | 0.1%
91786 | 5 | 0.1%
91784 | 5 | 0.1%
91780 | 5 | 0.1%
91765 | 5 | 0.1%
Other values (1642) | 6982 | 99.3%

Value | Count | Frequency (%)
90001 | 5 | 0.1%
90002 | 5 | 0.1%
90003 | 5 | 0.1%
90004 | 5 | 0.1%
90005 | 5 | 0.1%
90006 | 5 | 0.1%
90007 | 5 | 0.1%
90008 | 5 | 0.1%
90010 | 5 | 0.1%
90011 | 5 | 0.1%

Value | Count | Frequency (%)
96161 | 4 | 0.1%
96150 | 4 | 0.1%
96148 | 4 | 0.1%
96146 | 4 | 0.1%
96145 | 4 | 0.1%
96143 | 4 | 0.1%
96142 | 4 | 0.1%
96141 | 4 | 0.1%
96140 | 4 | 0.1%
96137 | 4 | 0.1%

Lat Long
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 1652
Distinct (%): 23.5%
Missing: 0
Missing (%): 0.0%
Memory size: 55.1 KiB

Value | Count
33.964131, -118.272783 | 5
34.159534, -116.425984 | 5
32.759327, -116.99726 | 5
32.912664, -116.635387 | 5
32.64164, -116.985026 | 5
Other values (1647) | 7007

Length

Max length: 22
Median length: 22
Mean length: 21.77673493
Min length: 18

Characters and Unicode

Total characters: 153134
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 33.964131, -118.272783
2nd row: 34.059281, -118.30742
3rd row: 34.048013, -118.293953
4th row: 34.062125, -118.315709
5th row: 34.039224, -118.266293

Common Values

Value | Count | Frequency (%)
33.964131, -118.272783 | 5 | 0.1%
34.159534, -116.425984 | 5 | 0.1%
32.759327, -116.99726 | 5 | 0.1%
32.912664, -116.635387 | 5 | 0.1%
32.64164, -116.985026 | 5 | 0.1%
32.607964, -117.059459 | 5 | 0.1%
34.105493, -117.660934 | 5 | 0.1%
34.141146, -117.655583 | 5 | 0.1%
34.101608, -118.055848 | 5 | 0.1%
33.992416, -117.807874 | 5 | 0.1%
Other values (1642) | 6982 | 99.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
121.9948138
 
0.1%
33.9641315
 
< 0.1%
33.7192215
 
< 0.1%
118.3810615
 
< 0.1%
34.1282845
 
< 0.1%
118.0477325
 
< 0.1%
33.8078825
 
< 0.1%
118.3479575
 
< 0.1%
34.0152175
 
< 0.1%
118.1099625
 
< 0.1%
Other values (3293)14011
99.6%

Most occurring characters

ValueCountFrequency (%)
120367
13.3%
316415
10.7%
.14064
9.2%
213686
8.9%
810983
 
7.2%
710764
 
7.0%
410493
 
6.9%
99532
 
6.2%
68936
 
5.8%
58670
 
5.7%
Other values (4)29224
19.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number117974
77.0%
Other Punctuation21096
 
13.8%
Space Separator7032
 
4.6%
Dash Punctuation7032
 
4.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
120367
17.3%
316415
13.9%
213686
11.6%
810983
9.3%
710764
9.1%
410493
8.9%
99532
8.1%
68936
7.6%
58670
7.3%
08128
 
6.9%
Other Punctuation
ValueCountFrequency (%)
.14064
66.7%
,7032
33.3%
Space Separator
ValueCountFrequency (%)
7032
100.0%
Dash Punctuation
ValueCountFrequency (%)
-7032
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common153134
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
120367
13.3%
316415
10.7%
.14064
9.2%
213686
8.9%
810983
 
7.2%
710764
 
7.0%
410493
 
6.9%
99532
 
6.2%
68936
 
5.8%
58670
 
5.7%
Other values (4)29224
19.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII153134
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
120367
13.3%
316415
10.7%
.14064
9.2%
213686
8.9%
810983
 
7.2%
710764
 
7.0%
410493
 
6.9%
99532
 
6.2%
68936
 
5.8%
58670
 
5.7%
Other values (4)29224
19.1%

Latitude
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1652
Distinct (%): 23.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.28330693
Minimum: 32.555828
Maximum: 41.962127
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 55.1 KiB

Quantile statistics

Minimum: 32.555828
5-th percentile: 32.980678
Q1: 34.030915
median: 36.391777
Q3: 38.227285
95-th percentile: 40.55815235
Maximum: 41.962127
Range: 9.406299
Interquartile range (IQR): 4.19637

Descriptive statistics

Standard deviation: 2.456118329
Coefficient of variation (CV): 0.06769279145
Kurtosis: -1.136114505
Mean: 36.28330693
Median Absolute Deviation (MAD): 2.263106
Skewness: 0.3031705998
Sum: 255144.2143
Variance: 6.032517246
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
33.964131 | 5 | 0.1%
34.159534 | 5 | 0.1%
32.759327 | 5 | 0.1%
32.912664 | 5 | 0.1%
32.64164 | 5 | 0.1%
32.607964 | 5 | 0.1%
34.105493 | 5 | 0.1%
34.141146 | 5 | 0.1%
34.101608 | 5 | 0.1%
33.992416 | 5 | 0.1%
Other values (1642) | 6982 | 99.3%

Value | Count | Frequency (%)
32.555828 | 5 | 0.1%
32.578103 | 5 | 0.1%
32.579134 | 5 | 0.1%
32.587557 | 5 | 0.1%
32.605012 | 5 | 0.1%
32.607964 | 5 | 0.1%
32.619465 | 5 | 0.1%
32.622999 | 5 | 0.1%
32.636792 | 5 | 0.1%
32.64164 | 5 | 0.1%

Value | Count | Frequency (%)
41.962127 | 4 | 0.1%
41.950683 | 4 | 0.1%
41.949216 | 4 | 0.1%
41.932207 | 4 | 0.1%
41.924174 | 4 | 0.1%
41.867908 | 4 | 0.1%
41.831901 | 4 | 0.1%
41.816595 | 4 | 0.1%
41.813521 | 4 | 0.1%
41.769709 | 4 | 0.1%

Longitude
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1651
Distinct (%): 23.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -119.7992151
Minimum: -124.301372
Maximum: -114.192901
Zeros: 0
Zeros (%): 0.0%
Negative: 7032
Negative (%): 100.0%
Memory size: 55.1 KiB

Quantile statistics

Minimum: -124.301372
5-th percentile: -122.998726
Q1: -121.815412
median: -119.73541
Q3: -118.043237
95-th percentile: -116.76058
Maximum: -114.192901
Range: 10.108471
Interquartile range (IQR): 3.772175

Descriptive statistics

Standard deviation: 2.157587776
Coefficient of variation (CV): -0.01801003266
Kurtosis: -1.13533181
Mean: -119.7992151
Median Absolute Deviation (MAD): 1.829311
Skewness: -0.03974068342
Sum: -842428.0805
Variance: 4.655185013
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
-121.994813 | 8 | 0.1%
-118.345928 | 5 | 0.1%
-116.99726 | 5 | 0.1%
-116.635387 | 5 | 0.1%
-116.985026 | 5 | 0.1%
-117.059459 | 5 | 0.1%
-117.660934 | 5 | 0.1%
-117.655583 | 5 | 0.1%
-118.055848 | 5 | 0.1%
-117.807874 | 5 | 0.1%
Other values (1641) | 6979 | 99.2%

Value | Count | Frequency (%)
-124.301372 | 4 | 0.1%
-124.240051 | 4 | 0.1%
-124.217378 | 4 | 0.1%
-124.210902 | 4 | 0.1%
-124.189977 | 4 | 0.1%
-124.163234 | 4 | 0.1%
-124.15428 | 4 | 0.1%
-124.121504 | 4 | 0.1%
-124.108897 | 4 | 0.1%
-124.098739 | 4 | 0.1%

Value | Count | Frequency (%)
-114.192901 | 5 | 0.1%
-114.36514 | 5 | 0.1%
-114.702256 | 4 | 0.1%
-114.71612 | 5 | 0.1%
-114.717964 | 5 | 0.1%
-114.758334 | 5 | 0.1%
-114.850784 | 5 | 0.1%
-115.152865 | 5 | 0.1%
-115.191857 | 5 | 0.1%
-115.257009 | 5 | 0.1%

Gender
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.1 KiB

Value | Count
Male | 3549
Female | 3483

Length

Max length: 6
Median length: 4
Mean length: 4.990614334
Min length: 4
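
The length statistics follow directly from the two category counts. As a worked check, the sketch below rebuilds the list of string lengths from the Male and Female counts reported for this variable and recomputes the summary values; it is an illustration using Python's `statistics` module, not the profiler's own code.

```python
from statistics import mean, median

# Category counts reported for Gender in this report.
counts = {"Male": 3549, "Female": 3483}

# One string length per observation: 3549 fours and 3483 sixes.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

total_characters = sum(lengths)
mean_length = mean(lengths)
median_length = median(lengths)

print(f"total characters: {total_characters}")
print(f"mean length: {mean_length:.9f}")
print(f"median length: {median_length}")
print(f"min/max length: {min(lengths)}/{max(lengths)}")
```

Because "Male" is the slightly more frequent value, the median length is 4 even though the mean sits close to 5.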

Characters and Unicode

Total characters: 35094
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Female
3rd row: Female
4th row: Female
5th row: Male

Common Values

Value | Count | Frequency (%)
Male | 3549 | 50.5%
Female | 3483 | 49.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
male3549
50.5%
female3483
49.5%

Most occurring characters

ValueCountFrequency (%)
e10515
30.0%
a7032
20.0%
l7032
20.0%
M3549
 
10.1%
F3483
 
9.9%
m3483
 
9.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter28062
80.0%
Uppercase Letter7032
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e10515
37.5%
a7032
25.1%
l7032
25.1%
m3483
 
12.4%
Uppercase Letter
ValueCountFrequency (%)
M3549
50.5%
F3483
49.5%

Most occurring scripts

ValueCountFrequency (%)
Latin35094
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e10515
30.0%
a7032
20.0%
l7032
20.0%
M3549
 
10.1%
F3483
 
9.9%
m3483
 
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII35094
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e10515
30.0%
a7032
20.0%
l7032
20.0%
M3549
 
10.1%
F3483
 
9.9%
m3483
 
9.9%

Senior Citizen
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 KiB

Value | Count
False | 5890
True | 1142

Value | Count | Frequency (%)
False | 5890 | 83.8%
True | 1142 | 16.2%

Partner
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 KiB

Value | Count
False | 3639
True | 3393

Value | Count | Frequency (%)
False | 3639 | 51.7%
True | 3393 | 48.3%

Dependents
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 KiB

Value | Count
False | 5412
True | 1620

Value | Count | Frequency (%)
False | 5412 | 77.0%
True | 1620 | 23.0%

Tenure Months
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 72
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.42178612
Minimum: 1
Maximum: 72
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 55.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 9
median: 29
Q3: 55
95-th percentile: 72
Maximum: 72
Range: 71
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 24.54525971
Coefficient of variation (CV): 0.7570606881
Kurtosis: -1.38782258
Mean: 32.42178612
Median Absolute Deviation (MAD): 22
Skewness: 0.2377308319
Sum: 227990
Variance: 602.4697742
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1 | 613 | 8.7%
72 | 362 | 5.1%
2 | 238 | 3.4%
3 | 200 | 2.8%
4 | 176 | 2.5%
71 | 170 | 2.4%
5 | 133 | 1.9%
7 | 131 | 1.9%
8 | 123 | 1.7%
9 | 119 | 1.7%
Other values (62) | 4767 | 67.8%

Value | Count | Frequency (%)
1 | 613 | 8.7%
2 | 238 | 3.4%
3 | 200 | 2.8%
4 | 176 | 2.5%
5 | 133 | 1.9%
6 | 110 | 1.6%
7 | 131 | 1.9%
8 | 123 | 1.7%
9 | 119 | 1.7%
10 | 116 | 1.6%

Value | Count | Frequency (%)
72 | 362 | 5.1%
71 | 170 | 2.4%
70 | 119 | 1.7%
69 | 95 | 1.4%
68 | 100 | 1.4%
67 | 98 | 1.4%
66 | 89 | 1.3%
65 | 76 | 1.1%
64 | 80 | 1.1%
63 | 72 | 1.0%

Phone Service
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 KiB

Value | Count
True | 6352
False | 680

Value | Count | Frequency (%)
True | 6352 | 90.3%
False | 680 | 9.7%

Multiple Lines
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.1 KiB

Value | Count
No | 3385
Yes | 2967
No phone service | 680

Length

Max length: 16
Median length: 3
Mean length: 3.775739477
Min length: 2

Characters and Unicode

Total characters: 26551
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: Yes
4th row: Yes
5th row: Yes

Common Values

Value | Count | Frequency (%)
No | 3385 | 48.1%
Yes | 2967 | 42.2%
No phone service | 680 | 9.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
no4065
48.4%
yes2967
35.4%
phone680
 
8.1%
service680
 
8.1%

Most occurring characters

ValueCountFrequency (%)
e5007
18.9%
o4745
17.9%
N4065
15.3%
s3647
13.7%
Y2967
11.2%
1360
 
5.1%
p680
 
2.6%
h680
 
2.6%
n680
 
2.6%
r680
 
2.6%
Other values (3)2040
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter18159
68.4%
Uppercase Letter7032
 
26.5%
Space Separator1360
 
5.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e5007
27.6%
o4745
26.1%
s3647
20.1%
p680
 
3.7%
h680
 
3.7%
n680
 
3.7%
r680
 
3.7%
v680
 
3.7%
i680
 
3.7%
c680
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
N4065
57.8%
Y2967
42.2%
Space Separator
ValueCountFrequency (%)
1360
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin25191
94.9%
Common1360
 
5.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e5007
19.9%
o4745
18.8%
N4065
16.1%
s3647
14.5%
Y2967
11.8%
p680
 
2.7%
h680
 
2.7%
n680
 
2.7%
r680
 
2.7%
v680
 
2.7%
Other values (2)1360
 
5.4%
Common
ValueCountFrequency (%)
1360
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII26551
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e5007
18.9%
o4745
17.9%
N4065
15.3%
s3647
13.7%
Y2967
11.2%
1360
 
5.1%
p680
 
2.6%
h680
 
2.6%
n680
 
2.6%
r680
 
2.6%
Other values (3)2040
7.7%

Internet Service
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size55.1 KiB
Fiber optic
3096 
DSL
2416 
No
1520 

Length

Max length11
Median length3
Mean length6.306029579
Min length2

Characters and Unicode

Total characters44344
Distinct characters14
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDSL
2nd rowFiber optic
3rd rowFiber optic
4th rowFiber optic
5th rowFiber optic

Common Values

Value | Count | Frequency (%)
Fiber optic | 3096 | 44.0%
DSL | 2416 | 34.4%
No | 1520 | 21.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
fiber3096
30.6%
optic3096
30.6%
dsl2416
23.9%
no1520
15.0%

Most occurring characters

ValueCountFrequency (%)
i6192
14.0%
o4616
10.4%
F3096
 
7.0%
b3096
 
7.0%
e3096
 
7.0%
r3096
 
7.0%
3096
 
7.0%
p3096
 
7.0%
t3096
 
7.0%
c3096
 
7.0%
Other values (4)8768
19.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter29384
66.3%
Uppercase Letter11864
26.8%
Space Separator3096
 
7.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i6192
21.1%
o4616
15.7%
b3096
10.5%
e3096
10.5%
r3096
10.5%
p3096
10.5%
t3096
10.5%
c3096
10.5%
Uppercase Letter
ValueCountFrequency (%)
F3096
26.1%
D2416
20.4%
S2416
20.4%
L2416
20.4%
N1520
12.8%
Space Separator
ValueCountFrequency (%)
3096
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin41248
93.0%
Common3096
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i6192
15.0%
o4616
11.2%
F3096
7.5%
b3096
7.5%
e3096
7.5%
r3096
7.5%
p3096
7.5%
t3096
7.5%
c3096
7.5%
D2416
 
5.9%
Other values (3)6352
15.4%
Common
ValueCountFrequency (%)
3096
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII44344
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i6192
14.0%
o4616
10.4%
F3096
 
7.0%
b3096
 
7.0%
e3096
 
7.0%
r3096
 
7.0%
3096
 
7.0%
p3096
 
7.0%
t3096
 
7.0%
c3096
 
7.0%
Other values (4)8768
19.8%

Online Security
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size7.0 KiB
Value | Count | Frequency (%)
False | 5017 | 71.3%
True | 2015 | 28.7%

Online Backup
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size7.0 KiB
Value | Count | Frequency (%)
False | 4607 | 65.5%
True | 2425 | 34.5%

Device Protection
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size7.0 KiB
Value | Count | Frequency (%)
False | 4614 | 65.6%
True | 2418 | 34.4%

Tech Support
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size7.0 KiB
Value | Count | Frequency (%)
False | 4992 | 71.0%
True | 2040 | 29.0%

Streaming TV
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size7.0 KiB
Value | Count | Frequency (%)
False | 4329 | 61.6%
True | 2703 | 38.4%

Streaming Movies
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size7.0 KiB
Value | Count | Frequency (%)
False | 4301 | 61.2%
True | 2731 | 38.8%

Contract
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size55.1 KiB
Month-to-month
3875 
Two year
1685 
One year
1472 

Length

Max length14
Median length14
Mean length11.30631399
Min length8

Characters and Unicode

Total characters79506
Distinct characters15
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
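
As an illustrative sketch (not the report's own code), the category totals listed for Contract can be reproduced with Python's standard unicodedata module. The value counts are copied from this variable's Common Values table; the helper name category_totals is a hypothetical choice. The two-letter codes map to the report's labels as Ll = Lowercase Letter, Lu = Uppercase Letter, Pd = Dash Punctuation, Zs = Space Separator.

```python
import unicodedata
from collections import Counter

# Value counts copied from the Contract "Common Values" table in this report.
contract_counts = {"Month-to-month": 3875, "Two year": 1685, "One year": 1472}


def category_totals(value_counts):
    """Sum Unicode general categories over every character of every observation."""
    totals = Counter()
    for value, count in value_counts.items():
        for ch in value:
            # unicodedata.category returns a two-letter code such as "Ll" or "Zs".
            totals[unicodedata.category(ch)] += count
    return totals


totals = category_totals(contract_counts)
for code, count in totals.most_common():
    print(code, count)
```

The totals should agree with the "Most occurring categories" table for this variable, and their sum with the "Total characters" figure.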

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMonth-to-month
2nd rowMonth-to-month
3rd rowMonth-to-month
4th rowMonth-to-month
5th rowMonth-to-month

Common Values

Value | Count | Frequency (%)
Month-to-month | 3875 | 55.1%
Two year | 1685 | 24.0%
One year | 1472 | 20.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
month-to-month3875
38.0%
year3157
31.0%
two1685
16.5%
one1472
 
14.4%

Most occurring characters

ValueCountFrequency (%)
o13310
16.7%
t11625
14.6%
n9222
11.6%
h7750
9.7%
-7750
9.7%
e4629
 
5.8%
M3875
 
4.9%
m3875
 
4.9%
3157
 
4.0%
y3157
 
4.0%
Other values (5)11156
14.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter61567
77.4%
Dash Punctuation7750
 
9.7%
Uppercase Letter7032
 
8.8%
Space Separator3157
 
4.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o13310
21.6%
t11625
18.9%
n9222
15.0%
h7750
12.6%
e4629
 
7.5%
m3875
 
6.3%
y3157
 
5.1%
a3157
 
5.1%
r3157
 
5.1%
w1685
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
M3875
55.1%
T1685
24.0%
O1472
 
20.9%
Dash Punctuation
ValueCountFrequency (%)
-7750
100.0%
Space Separator
ValueCountFrequency (%)
3157
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin68599
86.3%
Common10907
 
13.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o13310
19.4%
t11625
16.9%
n9222
13.4%
h7750
11.3%
e4629
 
6.7%
M3875
 
5.6%
m3875
 
5.6%
y3157
 
4.6%
a3157
 
4.6%
r3157
 
4.6%
Other values (3)4842
 
7.1%
Common
ValueCountFrequency (%)
-7750
71.1%
3157
28.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII79506
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o13310
16.7%
t11625
14.6%
n9222
11.6%
h7750
9.7%
-7750
9.7%
e4629
 
5.8%
M3875
 
4.9%
m3875
 
4.9%
3157
 
4.0%
y3157
 
4.0%
Other values (5)11156
14.0%

Paperless Billing
Boolean

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size7.0 KiB
Value | Count | Frequency (%)
True | 4168 | 59.3%
False | 2864 | 40.7%

Payment Method
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size55.1 KiB
Electronic check
2365 
Mailed check
1604 
Bank transfer (automatic)
1542 
Credit card (automatic)
1521 

Length

Max length25
Median length23
Mean length18.57522753
Min length12

Characters and Unicode

Total characters130621
Distinct characters23
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMailed check
2nd rowElectronic check
3rd rowElectronic check
4th rowElectronic check
5th rowBank transfer (automatic)

Common Values

Value | Count | Frequency (%)
Electronic check | 2365 | 33.6%
Mailed check | 1604 | 22.8%
Bank transfer (automatic) | 1542 | 21.9%
Credit card (automatic) | 1521 | 21.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
check3969
23.2%
automatic3063
17.9%
electronic2365
13.8%
mailed1604
9.4%
bank1542
 
9.0%
transfer1542
 
9.0%
credit1521
 
8.9%
card1521
 
8.9%

Most occurring characters

ValueCountFrequency (%)
c17252
13.2%
a12335
 
9.4%
t11554
 
8.8%
e11001
 
8.4%
10095
 
7.7%
i8553
 
6.5%
r8491
 
6.5%
k5511
 
4.2%
n5449
 
4.2%
o5428
 
4.2%
Other values (13)34952
26.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter107368
82.2%
Space Separator10095
 
7.7%
Uppercase Letter7032
 
5.4%
Open Punctuation3063
 
2.3%
Close Punctuation3063
 
2.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c17252
16.1%
a12335
11.5%
t11554
10.8%
e11001
10.2%
i8553
8.0%
r8491
7.9%
k5511
 
5.1%
n5449
 
5.1%
o5428
 
5.1%
d4646
 
4.3%
Other values (6)17148
16.0%
Uppercase Letter
ValueCountFrequency (%)
E2365
33.6%
M1604
22.8%
B1542
21.9%
C1521
21.6%
Space Separator
ValueCountFrequency (%)
10095
100.0%
Open Punctuation
ValueCountFrequency (%)
(3063
100.0%
Close Punctuation
ValueCountFrequency (%)
)3063
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin114400
87.6%
Common16221
 
12.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
c17252
15.1%
a12335
10.8%
t11554
10.1%
e11001
9.6%
i8553
 
7.5%
r8491
 
7.4%
k5511
 
4.8%
n5449
 
4.8%
o5428
 
4.7%
d4646
 
4.1%
Other values (10)24180
21.1%
Common
ValueCountFrequency (%)
10095
62.2%
(3063
 
18.9%
)3063
 
18.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII130621
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c17252
13.2%
a12335
 
9.4%
t11554
 
8.8%
e11001
 
8.4%
10095
 
7.7%
i8553
 
6.5%
r8491
 
6.5%
k5511
 
4.2%
n5449
 
4.2%
o5428
 
4.2%
Other values (13)34952
26.8%

Monthly Charges
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct1584
Distinct (%)22.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean64.79820819
Minimum18.25
Maximum118.75
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size55.1 KiB

Quantile statistics

Minimum18.25
5-th percentile19.65
Q135.5875
median70.35
Q389.8625
95-th percentile107.4225
Maximum118.75
Range100.5
Interquartile range (IQR)54.275

Descriptive statistics

Standard deviation30.08597388
Coefficient of variation (CV)0.464302559
Kurtosis-1.256156424
Mean64.79820819
Median Absolute Deviation (MAD)24.05
Skewness-0.2221029277
Sum455661
Variance905.1658246
MonotonicityNot monotonic
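
Several of the figures above are derived from others. The short sketch below recomputes range, IQR, coefficient of variation and variance from the minimum, maximum, quartiles, mean and standard deviation reported for Monthly Charges. The variable names are illustrative, and the inputs are the rounded values printed in this report, so the results match only up to rounding.

```python
# Inputs copied from the Monthly Charges statistics in this report.
minimum, maximum = 18.25, 118.75
q1, q3 = 35.5875, 89.8625
mean, std = 64.79820819, 30.08597388

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(f"range={value_range}, IQR={iqr:.4f}, CV={cv:.6f}, variance={variance:.4f}")
```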
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
20.05 | 61 | 0.9%
19.9 | 44 | 0.6%
19.95 | 44 | 0.6%
19.85 | 44 | 0.6%
19.65 | 43 | 0.6%
20 | 42 | 0.6%
19.7 | 42 | 0.6%
20.15 | 40 | 0.6%
19.55 | 40 | 0.6%
19.75 | 39 | 0.6%
Other values (1574) | 6593 | 93.8%

Value | Count | Frequency (%)
18.25 | 1 | < 0.1%
18.4 | 1 | < 0.1%
18.55 | 1 | < 0.1%
18.7 | 2 | < 0.1%
18.75 | 1 | < 0.1%
18.8 | 7 | 0.1%
18.85 | 5 | 0.1%
18.9 | 2 | < 0.1%
18.95 | 6 | 0.1%
19 | 7 | 0.1%

Value | Count | Frequency (%)
118.75 | 1 | < 0.1%
118.65 | 1 | < 0.1%
118.6 | 2 | < 0.1%
118.35 | 1 | < 0.1%
118.2 | 1 | < 0.1%
117.8 | 1 | < 0.1%
117.6 | 1 | < 0.1%
117.5 | 1 | < 0.1%
117.45 | 1 | < 0.1%
117.35 | 1 | < 0.1%

Total Charges
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct6530
Distinct (%)92.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2283.300441
Minimum18.8
Maximum8684.8
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size55.1 KiB

Quantile statistics

Minimum18.8
5-th percentile49.605
Q1401.45
median1397.475
Q33794.7375
95-th percentile6923.59
Maximum8684.8
Range8666
Interquartile range (IQR)3393.2875

Descriptive statistics

Standard deviation2266.771362
Coefficient of variation (CV)0.992760883
Kurtosis-0.2317987609
Mean2283.300441
Median Absolute Deviation (MAD)1222.8
Skewness0.9616424997
Sum16056168.7
Variance5138252.407
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
20.2 | 11 | 0.2%
19.75 | 9 | 0.1%
20.05 | 8 | 0.1%
19.65 | 8 | 0.1%
19.9 | 8 | 0.1%
19.55 | 7 | 0.1%
45.3 | 7 | 0.1%
20.15 | 6 | 0.1%
19.45 | 6 | 0.1%
20.25 | 6 | 0.1%
Other values (6520) | 6956 | 98.9%

Value | Count | Frequency (%)
18.8 | 1 | < 0.1%
18.85 | 2 | < 0.1%
18.9 | 1 | < 0.1%
19 | 1 | < 0.1%
19.05 | 1 | < 0.1%
19.1 | 3 | < 0.1%
19.15 | 1 | < 0.1%
19.2 | 4 | 0.1%
19.25 | 3 | < 0.1%
19.3 | 4 | 0.1%

Value | Count | Frequency (%)
8684.8 | 1 | < 0.1%
8672.45 | 1 | < 0.1%
8670.1 | 1 | < 0.1%
8594.4 | 1 | < 0.1%
8564.75 | 1 | < 0.1%
8547.15 | 1 | < 0.1%
8543.25 | 1 | < 0.1%
8529.5 | 1 | < 0.1%
8496.7 | 1 | < 0.1%
8477.7 | 1 | < 0.1%

Churn Label
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size7.0 KiB
Value | Count | Frequency (%)
False | 5163 | 73.4%
True | 1869 | 26.6%

Churn Value
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size55.1 KiB
0
5163 
1
1869 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters7032
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

Value | Count | Frequency (%)
0 | 5163 | 73.4%
1 | 1869 | 26.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
05163
73.4%
11869
 
26.6%

Most occurring characters

ValueCountFrequency (%)
05163
73.4%
11869
 
26.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number7032
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
05163
73.4%
11869
 
26.6%

Most occurring scripts

ValueCountFrequency (%)
Common7032
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
05163
73.4%
11869
 
26.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII7032
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
05163
73.4%
11869
 
26.6%

Churn Score
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct85
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean58.71530148
Minimum5
Maximum100
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size55.1 KiB

Quantile statistics

Minimum5
5-th percentile24
Q140
median61
Q375
95-th percentile94
Maximum100
Range95
Interquartile range (IQR)35

Descriptive statistics

Standard deviation21.53132105
Coefficient of variation (CV)0.3667071531
Kurtosis-1.005965193
Mean58.71530148
Median Absolute Deviation (MAD)17
Skewness-0.09107471235
Sum412886
Variance463.597786
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
80 | 151 | 2.1%
71 | 148 | 2.1%
77 | 145 | 2.1%
67 | 143 | 2.0%
76 | 141 | 2.0%
68 | 140 | 2.0%
70 | 140 | 2.0%
69 | 139 | 2.0%
78 | 138 | 2.0%
72 | 137 | 1.9%
Other values (75) | 5610 | 79.8%

Value | Count | Frequency (%)
5 | 1 | < 0.1%
7 | 2 | < 0.1%
8 | 2 | < 0.1%
9 | 3 | < 0.1%
20 | 83 | 1.2%
21 | 84 | 1.2%
22 | 82 | 1.2%
23 | 78 | 1.1%
24 | 86 | 1.2%
25 | 85 | 1.2%

Value | Count | Frequency (%)
100 | 50 | 0.7%
99 | 54 | 0.8%
98 | 50 | 0.7%
97 | 64 | 0.9%
96 | 52 | 0.7%
95 | 43 | 0.6%
94 | 46 | 0.7%
93 | 47 | 0.7%
92 | 48 | 0.7%
91 | 45 | 0.6%

CLTV
Real number (ℝ≥0)

Distinct3435
Distinct (%)48.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4401.445108
Minimum2003
Maximum6500
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size55.1 KiB

Quantile statistics

Minimum2003
5-th percentile2296.55
Q13469.75
median4527.5
Q35381
95-th percentile6087
Maximum6500
Range4497
Interquartile range (IQR)1911.25

Descriptive statistics

Standard deviation1182.414266
Coefficient of variation (CV)0.2686422838
Kurtosis-0.9333466291
Mean4401.445108
Median Absolute Deviation (MAD)922.5
Skewness-0.3113273648
Sum30950962
Variance1398103.496
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5546 | 8 | 0.1%
4741 | 7 | 0.1%
4369 | 7 | 0.1%
5527 | 7 | 0.1%
4115 | 7 | 0.1%
5092 | 7 | 0.1%
4745 | 7 | 0.1%
5915 | 7 | 0.1%
5461 | 7 | 0.1%
5137 | 7 | 0.1%
Other values (3425) | 6961 | 99.0%

Value | Count | Frequency (%)
2003 | 3 | < 0.1%
2004 | 3 | < 0.1%
2006 | 1 | < 0.1%
2007 | 4 | 0.1%
2008 | 1 | < 0.1%
2009 | 2 | < 0.1%
2010 | 3 | < 0.1%
2011 | 2 | < 0.1%
2013 | 2 | < 0.1%
2014 | 1 | < 0.1%

Value | Count | Frequency (%)
6500 | 1 | < 0.1%
6499 | 2 | < 0.1%
6495 | 1 | < 0.1%
6494 | 2 | < 0.1%
6492 | 3 | < 0.1%
6491 | 1 | < 0.1%
6490 | 1 | < 0.1%
6489 | 1 | < 0.1%
6488 | 1 | < 0.1%
6487 | 2 | < 0.1%

Churn Reason
Categorical

HIGH CORRELATION
MISSING

Distinct20
Distinct (%)1.1%
Missing5163
Missing (%)73.4%
Memory size55.1 KiB
Attitude of support person
192 
Competitor offered higher download speeds
189 
Competitor offered more data
162 
Don't know
154 
Competitor made better offer
140 
Other values (15)
1032 

Length

Max length41
Median length31
Mean length25.19422151
Min length5

Characters and Unicode

Total characters47088
Distinct characters37
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCompetitor made better offer
2nd rowMoved
3rd rowMoved
4th rowMoved
5th rowCompetitor had better devices

Common Values

Value | Count | Frequency (%)
Attitude of support person | 192 | 2.7%
Competitor offered higher download speeds | 189 | 2.7%
Competitor offered more data | 162 | 2.3%
Don't know | 154 | 2.2%
Competitor made better offer | 140 | 2.0%
Attitude of service provider | 135 | 1.9%
Competitor had better devices | 130 | 1.8%
Network reliability | 103 | 1.5%
Product dissatisfaction | 102 | 1.5%
Price too high | 98 | 1.4%
Other values (10) | 464 | 6.6%
(Missing) | 5163 | 73.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
competitor621
 
9.5%
of542
 
8.3%
offered351
 
5.3%
attitude327
 
5.0%
better270
 
4.1%
support231
 
3.5%
service224
 
3.4%
data219
 
3.3%
person192
 
2.9%
dissatisfaction191
 
2.9%
Other values (37)3396
51.7%

Most occurring characters

ValueCountFrequency (%)
e5658
12.0%
o4751
 
10.1%
4695
 
10.0%
t4427
 
9.4%
r3512
 
7.5%
i3114
 
6.6%
d2659
 
5.6%
s2225
 
4.7%
a1942
 
4.1%
f1891
 
4.0%
Other values (27)12214
25.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter40150
85.3%
Space Separator4695
 
10.0%
Uppercase Letter1957
 
4.2%
Other Punctuation198
 
0.4%
Dash Punctuation88
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e5658
14.1%
o4751
11.8%
t4427
11.0%
r3512
8.7%
i3114
 
7.8%
d2659
 
6.6%
s2225
 
5.5%
a1942
 
4.8%
f1891
 
4.7%
p1746
 
4.3%
Other values (13)8225
20.5%
Uppercase Letter
ValueCountFrequency (%)
C621
31.7%
A327
16.7%
P239
 
12.2%
L220
 
11.2%
D160
 
8.2%
N103
 
5.3%
S89
 
4.5%
W88
 
4.5%
E57
 
2.9%
M53
 
2.7%
Other Punctuation
ValueCountFrequency (%)
'154
77.8%
/44
 
22.2%
Space Separator
ValueCountFrequency (%)
4695
100.0%
Dash Punctuation
ValueCountFrequency (%)
-88
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin42107
89.4%
Common4981
 
10.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e5658
13.4%
o4751
11.3%
t4427
10.5%
r3512
 
8.3%
i3114
 
7.4%
d2659
 
6.3%
s2225
 
5.3%
a1942
 
4.6%
f1891
 
4.5%
p1746
 
4.1%
Other values (23)10182
24.2%
Common
ValueCountFrequency (%)
4695
94.3%
'154
 
3.1%
-88
 
1.8%
/44
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII47088
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e5658
12.0%
o4751
 
10.1%
4695
 
10.0%
t4427
 
9.4%
r3512
 
7.5%
i3114
 
6.6%
d2659
 
5.6%
s2225
 
4.7%
a1942
 
4.1%
f1891
 
4.0%
Other values (27)12214
25.9%

Churn
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size55.1 KiB
0.0
5163 
1.0
1869 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters21096
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

Value | Count | Frequency (%)
0.0 | 5163 | 73.4%
1.0 | 1869 | 26.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.05163
73.4%
1.01869
 
26.6%

Most occurring characters

ValueCountFrequency (%)
012195
57.8%
.7032
33.3%
11869
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number14064
66.7%
Other Punctuation7032
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
012195
86.7%
11869
 
13.3%
Other Punctuation
ValueCountFrequency (%)
.7032
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common21096
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
012195
57.8%
.7032
33.3%
11869
 
8.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII21096
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
012195
57.8%
.7032
33.3%
11869
 
8.9%

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
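
A minimal sketch of this calculation in plain Python follows. It assumes tied values receive the average of the ranks they span, which is a common convention but is not confirmed by the report; the function names and the toy data are illustrative only.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Toy data (made up): y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
print(spearman_rho(x, y))  # 1.0, because the relationship is perfectly monotonic
print(pearson_r(x, y))     # below 1, because the relationship is not linear
```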

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
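
The calculation can be sketched in a few lines of plain Python. The inputs are the Tenure Months and Total Charges values of the first five sample rows of this report; the function and variable names are illustrative, and the sketch does not guard against zero variance. The second call rescales and shifts both variables to illustrate the invariance described above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n factors in covariance and variances cancel, so they are omitted.
    return cov / sqrt(var_x * var_y)


# First five sample rows of this report (Tenure Months, Total Charges).
tenure_months = [2, 2, 8, 28, 49]
total_charges = [108.15, 151.65, 820.50, 3046.05, 5036.30]

r_original = pearson_r(tenure_months, total_charges)

# Changing location and (positive) scale of each variable leaves r unchanged.
tenure_years = [t / 12 for t in tenure_months]
charges_shifted = [0.5 * c + 100 for c in total_charges]
r_rescaled = pearson_r(tenure_years, charges_shifted)

print(r_original, r_rescaled)
```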

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
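
The definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition in plain Python; it is an illustration, not necessarily what the report computed, since statistical libraries often apply a tie-adjusted variant (tau-b) instead. The function name and toy data are assumptions.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs, with tied pairs counted as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # The sign of the product says whether x and y move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Toy data (made up): two of the three pairs agree in direction, one disagrees.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This pairwise loop is quadratic in the number of observations, which is fine for a sketch but slow for large columns.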

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
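
The following plain-Python sketch shows one way to compute a bias-corrected Cramér's V from two columns of labels. It follows the correction as it is commonly stated for Bergsma's estimator (subtracting (k-1)(r-1)/(n-1) from φ² and shrinking the table dimensions); treat that formula, the function name and the toy data as assumptions rather than the report's exact implementation.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals, col_totals = Counter(xs), Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction: shrink phi squared and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a constant column carries no association
    return sqrt(phi2_corr / denom)


# Toy data (made up): labels that determine each other, then independent labels.
print(cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50))
print(cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25))
```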

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

(index) | CustomerID | Count | Country | State | City | Zip Code | Lat Long | Latitude | Longitude | Gender | Senior Citizen | Partner | Dependents | Tenure Months | Phone Service | Multiple Lines | Internet Service | Online Security | Online Backup | Device Protection | Tech Support | Streaming TV | Streaming Movies | Contract | Paperless Billing | Payment Method | Monthly Charges | Total Charges | Churn Label | Churn Value | Churn Score | CLTV | Churn Reason | Churn
0 | 3668-QPYBK | 1 | United States | California | Los Angeles | 90003 | 33.964131, -118.272783 | 33.964131 | -118.272783 | Male | No | No | No | 2 | Yes | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 86 | 3239 | Competitor made better offer | 1.0
1 | 9237-HQITU | 1 | United States | California | Los Angeles | 90005 | 34.059281, -118.30742 | 34.059281 | -118.307420 | Female | No | No | Yes | 2 | Yes | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 1 | 67 | 2701 | Moved | 1.0
2 | 9305-CDSKC | 1 | United States | California | Los Angeles | 90006 | 34.048013, -118.293953 | 34.048013 | -118.293953 | Female | No | No | Yes | 8 | Yes | Yes | Fiber optic | No | No | Yes | No | Yes | Yes | Month-to-month | Yes | Electronic check | 99.65 | 820.50 | Yes | 1 | 86 | 5372 | Moved | 1.0
3 | 7892-POOKP | 1 | United States | California | Los Angeles | 90010 | 34.062125, -118.315709 | 34.062125 | -118.315709 | Female | No | Yes | Yes | 28 | Yes | Yes | Fiber optic | No | No | Yes | Yes | Yes | Yes | Month-to-month | Yes | Electronic check | 104.80 | 3046.05 | Yes | 1 | 84 | 5003 | Moved | 1.0
4 | 0280-XJGEX | 1 | United States | California | Los Angeles | 90015 | 34.039224, -118.266293 | 34.039224 | -118.266293 | Male | No | No | Yes | 49 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | Month-to-month | Yes | Bank transfer (automatic) | 103.70 | 5036.30 | Yes | 1 | 89 | 5340 | Competitor had better devices | 1.0
5 | 4190-MFLUW | 1 | United States | California | Los Angeles | 90020 | 34.066367, -118.309868 | 34.066367 | -118.309868 | Female | No | Yes | No | 10 | Yes | No | DSL | No | No | Yes | Yes | No | No | Month-to-month | No | Credit card (automatic) | 55.20 | 528.35 | Yes | 1 | 78 | 5925 | Competitor offered higher download speeds | 1.0
6 | 8779-QRDMV | 1 | United States | California | Los Angeles | 90022 | 34.02381, -118.156582 | 34.023810 | -118.156582 | Male | Yes | No | No | 1 | No | No phone service | DSL | No | No | Yes | No | No | Yes | Month-to-month | Yes | Electronic check | 39.65 | 39.65 | Yes | 1 | 100 | 5433 | Competitor offered more data | 1.0
7 | 1066-JKSGK | 1 | United States | California | Los Angeles | 90024 | 34.066303, -118.435479 | 34.066303 | -118.435479 | Male | No | No | No | 1 | Yes | No | No | No | No | No | No | No | No | Month-to-month | No | Mailed check | 20.15 | 20.15 | Yes | 1 | 92 | 4832 | Competitor made better offer | 1.0
8 | 6467-CHFZW | 1 | United States | California | Los Angeles | 90028 | 34.099869, -118.326843 | 34.099869 | -118.326843 | Male | No | Yes | Yes | 47 | Yes | Yes | Fiber optic | No | Yes | No | No | Yes | Yes | Month-to-month | Yes | Electronic check | 99.35 | 4749.15 | Yes | 1 | 77 | 5789 | Competitor had better devices | 1.0
9 | 8665-UTDHZ | 1 | United States | California | Los Angeles | 90029 | 34.089953, -118.294824 | 34.089953 | -118.294824 | Male | No | Yes | No | 1 | No | No phone service | DSL | No | Yes | No | No | No | No | Month-to-month | No | Electronic check | 30.20 | 30.20 | Yes | 1 | 97 | 2915 | Competitor had better devices | 1.0

Last rows

(index) | CustomerID | Count | Country | State | City | Zip Code | Lat Long | Latitude | Longitude | Gender | Senior Citizen | Partner | Dependents | Tenure Months | Phone Service | Multiple Lines | Internet Service | Online Security | Online Backup | Device Protection | Tech Support | Streaming TV | Streaming Movies | Contract | Paperless Billing | Payment Method | Monthly Charges | Total Charges | Churn Label | Churn Value | Churn Score | CLTV | Churn Reason | Churn
7022 | 0871-OPBXW | 1 | United States | California | Twentynine Palms | 92277 | 34.17211, -115.769773 | 34.172110 | -115.769773 | Female | No | No | No | 2 | Yes | No | No | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 20.05 | 39.25 | No | 0 | 80 | 5191 | NaN | 0.0
7023 | 3605-JISKB | 1 | United States | California | Twentynine Palms | 92278 | 34.457829, -116.139589 | 34.457829 | -116.139589 | Male | Yes | Yes | No | 55 | Yes | Yes | DSL | Yes | Yes | No | No | No | No | One year | No | Credit card (automatic) | 60.00 | 3316.10 | No | 0 | 71 | 4212 | NaN | 0.0
7024 | 9767-FFLEM | 1 | United States | California | Westmorland | 92281 | 33.03679, -115.60503 | 33.036790 | -115.605030 | Male | No | No | No | 38 | Yes | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Credit card (automatic) | 69.50 | 2625.25 | No | 0 | 35 | 4591 | NaN | 0.0
7025 | 8456-QDAVC | 1 | United States | California | Winterhaven | 92283 | 32.852947, -114.850784 | 32.852947 | -114.850784 | Male | No | No | No | 19 | Yes | No | Fiber optic | No | No | No | No | Yes | No | Month-to-month | Yes | Bank transfer (automatic) | 78.70 | 1495.10 | No | 0 | 20 | 2464 | NaN | 0.0
7026 | 7750-EYXWZ | 1 | United States | California | Yucca Valley | 92284 | 34.159534, -116.425984 | 34.159534 | -116.425984 | Female | No | No | No | 12 | No | No phone service | DSL | No | Yes | Yes | Yes | Yes | Yes | One year | No | Electronic check | 60.65 | 743.30 | No | 0 | 24 | 3740 | NaN | 0.0
7027 | 2569-WGERO | 1 | United States | California | Landers | 92285 | 34.341737, -116.539416 | 34.341737 | -116.539416 | Female | No | No | No | 72 | Yes | No | No | No | No | No | No | No | No | Two year | Yes | Bank transfer (automatic) | 21.15 | 1419.40 | No | 0 | 45 | 5306 | NaN | 0.0
7028 | 6840-RESVB | 1 | United States | California | Adelanto | 92301 | 34.667815, -117.536183 | 34.667815 | -117.536183 | Male | No | Yes | Yes | 24 | Yes | Yes | DSL | Yes | No | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.50 | No | 0 | 59 | 2140 | NaN | 0.0
7029 | 2234-XADUH | 1 | United States | California | Amboy | 92304 | 34.559882, -115.637164 | 34.559882 | -115.637164 | Female | No | Yes | Yes | 72 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.90 | No | 0 | 71 | 5560 | NaN | 0.0
7030 | 4801-JZAZL | 1 | United States | California | Angelus Oaks | 92305 | 34.1678, -116.86433 | 34.167800 | -116.864330 | Female | No | Yes | Yes | 11 | No | No phone service | DSL | Yes | No | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 | 59 | 2793 | NaN | 0.0
7031 | 3186-AJIEK | 1 | United States | California | Apple Valley | 92308 | 34.424926, -117.184503 | 34.424926 | -117.184503 | Male | No | No | No | 66 | Yes | No | Fiber optic | Yes | No | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.50 | No | 0 | 38 | 5097 | NaN | 0.0